1. Introduction to Epidemiology

1.1 What is ‘epidemiology’?

Epidemiology is the study of the distribution and determinants of disease frequency in human populations.


This definition is based on two fundamental assumptions:

* Human disease does not occur at random.
* Human disease has causal and preventive factors that can be identified through systematic investigations of different populations in different places at different times.


The application of epidemiology is therefore to control health problems in a defined population. This emphasizes that epidemiologists are concerned not only with death, illness and disability but also with positive health states and ways of improving health.


1.2 What are the aims of epidemiology?
* To describe the distribution and magnitude of health and disease problems in human populations
* To identify the causes/correlates (related factors) of disease
* To provide data essential for planning, implementation and evaluation of services for prevention, control and treatment of diseases, and to prioritize those services


1.3 Measurements in epidemiology
In describing the distribution and magnitude of health and disease problems in a population, certain measurements are required. They are:

* measurements of mortality
* measurements of morbidity
* measurements of disability
* measurements of natality
* measurements of the characteristics or attributes of a disease
* measurements of medical needs, health care facilities and their utilization
* measurements of environmental factors related to disease
* measurements of demographic variables
1.4 Types of measures used in epidemiology

Measurements in epidemiology are made using different measures, which quantify the occurrence of disease or health related events. The most basic measure of disease frequency is a COUNT of affected individuals.

During an epidemic of dengue haemorrhagic fever in 2007, 150 new cases were reported from area A and 85 cases from area B.


Does this information help you to decide where it would be safer to live?

The answer to this question is NO.

We cannot draw any conclusions as we do not know the size of the population in the two areas from which these cases were reported.


A count alone has limited value in describing the “problem” in a community. 


The count needs to be related to the size of the source population in which the cases or events occurred.

Let us now see the information presented in Table 1.





Table 1: No. of dengue haemorrhagic fever cases and population in areas A and B

Location    No. of new cases    Population
Area A      150                 37,500
Area B      85                  7,100


Although more cases were reported from area A, it is noted that the population of this area is about five times that of area B.
In area A, 150 out of 37,500 got dengue, i.e. one out of every 250 persons.
In area B, 85 out of 7,100 got dengue, i.e. one out of every 84 persons.
Here, we have taken the PROPORTION of dengue cases in the total population of each area. This proportion can also be expressed as a PERCENTAGE. For example,

Area A:
The percentage of dengue cases among its total population = 150 / 37,500 x 100 = 0.4%

Area B:
The percentage of dengue cases among its total population = 85 / 7,100 x 100 = 1.2%


It appears that area A is better to live in than area B. 
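
The comparison above can be reproduced in a few lines of Python. This is only a minimal sketch using the counts from Table 1; the function name `percentage_affected` is an illustrative assumption and not part of the original material.

```python
def percentage_affected(cases: int, population: int) -> float:
    """Return the cases as a percentage of the source population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100


# (new cases, population) for each area, taken from Table 1.
areas = {"A": (150, 37_500), "B": (85, 7_100)}

for name, (cases, population) in areas.items():
    pct = percentage_affected(cases, population)
    # Number of persons per case, e.g. 37,500 / 150 = 250.
    persons_per_case = population / cases
    print(f"Area {name}: {pct:.1f}% affected, about 1 in {persons_per_case:.0f} persons")
```

Relating each count to its own population is what makes the two areas comparable; the raw counts alone point in the opposite direction.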


However, the picture would be different if the 85 cases in area B occurred during a period of one year, compared to 150 cases occurring in area A over a period of 4 months.

Now, calculate the percentage of dengue cases in areas A & B over one year and comment on your findings.







What does all this mean? 






When measuring disease or a health-related event in a community, it is necessary to use a measure that will take into account the number of cases, the population in which these cases occur and the time period during which the cases occurred.






RATE is such a measure of disease frequency.

Community Stream, F/M, Colombo 


A rate measures the occurrence of a given event in a defined population during a given period of time. Rates are therefore calculated by expressing the number of events (numerator) as a fraction of the population (denominator) in which the event occurred.








$$
\text{Rate} = \frac{\text{No. of events occurring in a given population during a given time period}}{\text{No. of subjects in that population in which the event occurred}} \times K
$$





The rate comprises the following:

* a numerator,
* a denominator,
* a time specification during which the event occurred and
* a constant K, which is usually a multiple of 10.


You have learnt about rates when you studied demography.
Please read up the module on demography. You may recall that we followed the same principle, i.e. the numerator being the number of events and the denominator being the population in which the event can occur, together with the time period.
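
The four components of a rate can be kept together in a small structure. The sketch below is illustrative only: the class name `Rate`, the choice of K = 1,000 and the assumption that the dengue counts of areas A and B cover the whole of 2007 are ours, not the handout's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    events: int       # numerator: number of events
    population: int   # denominator: population in which the events occurred
    period: str       # time specification, kept as a label
    k: int = 1_000    # constant K (assumed here to be 1,000)

    def value(self) -> float:
        """Events per K persons in the stated period."""
        return self.events / self.population * self.k

    def describe(self) -> str:
        return f"{self.value():.1f} per {self.k:,} population during {self.period}"


# Dengue counts for areas A and B, assuming both refer to the year 2007.
area_a = Rate(events=150, population=37_500, period="2007")
area_b = Rate(events=85, population=7_100, period="2007")

print("Area A:", area_a.describe())
print("Area B:", area_b.describe())
```

Keeping the period next to the number is deliberate: without the time specification the same figure could describe four months or a full year.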





RATIO is another measure of disease frequency. It expresses a relation in size between two ‘counts’, obtained by simply dividing one quantity by another.

For example, the male:female ratio in a defined population is given by the number of males to the number of females, X:Y or X/Y.





You should note that the numerator is not a component of the denominator.
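
The difference between a ratio and a proportion can be made concrete with the dengue counts used earlier. This is a hedged sketch; the helper names `ratio` and `proportion` are illustrative assumptions.

```python
from fractions import Fraction


def ratio(x: int, y: int) -> Fraction:
    """Ratio X:Y of two separate counts, reduced to lowest terms."""
    if y == 0:
        raise ValueError("the second count must not be zero")
    return Fraction(x, y)


def proportion(part: int, whole: int) -> float:
    """Proportion: here the numerator IS a component of the denominator."""
    if whole <= 0:
        raise ValueError("the whole must be positive")
    return part / whole


# Ratio: new cases in area A to new cases in area B (two separate counts).
cases_ratio = ratio(150, 85)
print(f"Cases A:B = {cases_ratio.numerator}:{cases_ratio.denominator} "
      f"(about {float(cases_ratio):.2f} to 1)")

# Proportion: cases in area A among the whole population of area A.
print(f"Proportion affected in area A = {proportion(150, 37_500):.3f}")
```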

Please think of other ratios that you know of.
What are the measures of disease frequency that you learnt so far and how do these differ from each other?
Q1. The following information is given for district A for the year 2007.
Total population 1.7 million
Geographical extent 652 sq. kilometers
Male population 51%
Urban population 1.2 million
A survey carried out in the district revealed that 777,240 men were able to read and write.

Calculate the following:
1. Population density
2. Sex ratio
3. Proportion of urban population
4. Literacy rate for the male population
Q2. This area had 273,000 households, of which 29,000 households had no toilet facilities. The water supply to this area was through roadside taps and from wells. A total of 14,450 deaths were reported from the area, of which 1,068 were infant deaths. Of these infant deaths, 75% occurred during the first month of life. Twenty percent of the deaths in the area were due to diarrhoea.

A total of 42,700 live births were recorded in this area. This area being a proclaimed area where the Registrar of Births is a medical person, still births were also registered. A total of 700 still births were registered. There were 5 maternal deaths reported during this year.

2.1 Calculate the following rates using the above data

2. Crude birth rate
Q3. There are 12 Medical Officers of Health, 226 Public Health Midwives and 26 Public Health Nursing Sisters in this area. A total of 1,165 Medical Officers and 3,165 Nurses work in the 25 hospitals in the area. During year 2007, 10,000 beds were available and 500,000 patients were treated as inward-patients in these hospitals. OPD attendance was 3,940,000.

Calculate the following:
1. Availability of health personnel on a population basis
2. Hospitals per population
3. Beds per population
2. Prevalence and Incidence

The measures of disease frequency most commonly used in epidemiology fall into two broad categories:
* Prevalence
* Incidence


Measuring prevalence and incidence depends on counting ‘cases’ in a defined population within a defined time period.

The first question we must ask ourselves when we attempt to count disease is,

“Who is a ‘case’?” i.e. the case definition.


This may sound very simple but let us look at the following example. 


Cholera can exist in the population in the following manner: 





* Typical ‘rice water’ stools: needs hospitalization
* Mild watery diarrhoea: may / may not seek treatment
* Few loose stools: person does not take much notice
* Stools demonstrate Cholera vibrio: no signs and symptoms





Which of these would we count as a ‘case of cholera’? 




It is very clear that before we start quantifying disease, we need a clear definition of a ‘case’ to identify ‘disease’ / ‘healthy status’ in a person. The definition should be clear and unambiguous.

In epidemiology, it is essential that the definition of ‘case’ be clearly stated. It should be easy to use and easy to measure in a standard manner under a wide variety of circumstances by different people.








The next question we must ask ourselves when we attempt to count disease is,

“Do we have a correct estimate of the population in which a ‘case’ could occur?”

Ideally, this source population should include only persons who are susceptible to that illness. For example, in the case of food poisoning, only those who consumed the infected food will form the population at risk. The section of the population that is susceptible is called the population at risk. Sometimes, information on the population at risk is not available and in such a situation, the total population is used as an approximation.


In a visual examination of patients for identifying the prevalence of cataract, whom do you consider as ‘cases’ and ‘population at risk’?


2.1 Prevalence 


In quantifying disease in a defined population, we can count all individuals who have the disease at a given point in time. This number in the defined population is called the prevalence.

Prevalence is a measure of disease burden in a community / population as it gives an idea of the disease status in a defined population at a given point in time.


$$
\text{Prevalence (P)} = \frac{\text{No. of existing cases of a disease or condition at a specified point in time}}{\text{Population-at-risk in the defined population at the same point in time}} \times K
$$

K = multiple of 10


Since we are referring here to a specified point in time, this is also called point prevalence. In this instance, prevalence does not as such involve a time period. Therefore, by strict definition, prevalence is a proportion and not a rate. However, since the point can also refer to a specific point in calendar time such as per week, per year etc., it is also called a prevalence rate.
2.2 Incidence

Incidence of a disease is the number of new cases of the disease that occur during a specified period of time in a defined population.

Incidence rate is defined as the number of new cases occurring in a defined population-at-risk during a given period of time.














$$
\text{Incidence Rate (IR)} = \frac{\text{No. of new cases of a specified disease during a given period of time}}{\text{Population-at-risk during that time period}} \times K
$$

K = multiple of 10


It can also refer to new spells or episodes of illness that occur during the given period of time. For example, if a person suffers from a common cold twice during the year, there would be 2 spells of sickness in that year.











$$
\text{Incidence Rate (Spells)} = \frac{\text{No. of new spells or episodes of a specified disease during a given period of time}}{\text{Population-at-risk during that time period}} \times K
$$

K = multiple of 10


Incidence measures the rate at which cases occur in a population. It is not influenced by the duration of the disease.

If a count of all cases, i.e. old cases plus new cases that occur over a period of time (i.e. the total number of persons who are known to have had the disease at any time during a specified period of time) is taken, a period prevalence can be calculated. The denominator used for this calculation is the population-at-risk midway through the defined period of time.


$$
\text{Period prevalence} = \frac{\text{No. of existing cases of a specified disease at the beginning of a given period} + \text{No. of new cases diagnosed during the same period}}{\text{Estimated at-risk population at ‘mid’ time interval}} \times K
$$

K = multiple of 10


This measure combines both point prevalence (status at a single point in time) and incidence (risk of developing disease over a period of time) in a single parameter.

This method is useful and convenient when it is particularly difficult to determine when a disease can be considered present, such as in the diagnosis of mental illness.

In period prevalence, what are the two disease frequencies that are combined in a single parameter?
Figure 1 illustrates the beginning and ending of a disease in 10 persons during a given time period.

[Figure 1: cases 1 to 10 plotted against the time followed, starting on Jan. 01; a dot marks the onset of the disease.]

Point prevalence on Jan. 01 = 1, 2, 5, 7, 10
Point prevalence on Dec. 31 = 1, 3, 6, 9
Incidence (Jan. 01 - Dec. 31) = 3, 4, 6, 8, 9
Period prevalence (Jan. 01 - Dec. 31) = (1, 2, 5, 7, 10) + (3, 4, 6, 8, 9)
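
The bookkeeping for Figure 1 can be checked with Python sets. This is a minimal sketch that only uses the case numbers listed above; the variable names are illustrative assumptions, and the last quantity (new cases no longer counted on Dec. 31) is derived here rather than stated in the handout.

```python
# Case numbers read from the lists given for Figure 1 (10 persons in total).
prevalent_jan01 = {1, 2, 5, 7, 10}   # already diseased on Jan. 01
prevalent_dec31 = {1, 3, 6, 9}       # diseased on Dec. 31
incident = {3, 4, 6, 8, 9}           # onset between Jan. 01 and Dec. 31

# Period prevalence counts old cases plus new cases during the period.
period_prevalent = prevalent_jan01 | incident

# New cases that are no longer counted as cases on Dec. 31.
ended_new_cases = incident - prevalent_dec31

print("Point prevalence count on Jan. 01:", len(prevalent_jan01))
print("Point prevalence count on Dec. 31:", len(prevalent_dec31))
print("New (incident) cases in the year:", len(incident))
print("Period prevalence count:", len(period_prevalent))
print("New cases not present on Dec. 31:", sorted(ended_new_cases))
```

An old case and a new case are never the same person in this example, which is why the two lists can simply be added for the period prevalence.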


In a study of oral contraceptive (OC) use and bacteriuria, a total of 2390 women aged 16-49 years were identified as free from bacteriuria. Of these, 482 were OC users at the time of the first survey in 1985. At the second survey in 1988, 27 of the OC users had developed bacteriuria.

What is the incidence rate of bacteriuria among OC users?

IR = 27 cases of bacteriuria among 482 OC users
= 27/482 or 5.6 percent bacteriuria among OC users within 3 years


The time specification is important since 5.6 percent in 6 months or 1 year is quite different from 5.6 percent in three years. IR assumes that the entire population-at-risk at the beginning of the time period (in this instance, 482 OC users) has been followed up for the three years.
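
The bacteriuria calculation can be written as a short function. This is a sketch under the same assumption the text states, namely that all 482 OC users were followed for the full three years; the function name `incidence_rate` is illustrative.

```python
def incidence_rate(new_cases: int, at_risk_at_start: int, k: int = 100) -> float:
    """New cases per k persons who were at risk at the start of follow-up."""
    if at_risk_at_start <= 0:
        raise ValueError("population at risk must be positive")
    return new_cases / at_risk_at_start * k


follow_up_years = 3  # time specification taken from the example
ir = incidence_rate(new_cases=27, at_risk_at_start=482)
print(f"IR = {ir:.1f} per 100 OC users within {follow_up_years} years")
```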


More about incidence...

The incidence that we have been discussing so far is also called a cumulative incidence (CI). Our counts of the new disease episodes are those that have been accumulated over a defined period of time. Thus, cumulative incidence provides an estimate of the likelihood or risk of an individual developing the disease during the specified period of time.

Here, we assume that the total group is available throughout the study period, hence we are able to identify disease occurrence for the whole group within the total duration. However, this may not always be so. For example, if we have a group of 1000 persons in our study group, who will be studied during a period of one full year, some of them (e.g. 250) are available in the area only for 6 months, that is for 0.5 years, whereas the other 750 are available for 1 year.
We have one of two options:
1. To leave out those who were not available for ‘one full year’ and calculate incidence based only on the 750 persons or
2. Find a way to use the data related to the total group. If so, how do we do this?
We can calculate the person-time during which each member in the group was followed up, i.e. 250 for 0.5 years and 750 for one year.

Therefore, the total person-time of observations is = (250 x 0.5) + (750 x 1) = 875 person-years
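
The person-time sum generalises to any number of follow-up groups. A minimal sketch, with the helper name `total_person_time` chosen here for illustration:

```python
def total_person_time(groups: list[tuple[int, float]]) -> float:
    """Sum (persons x years followed) over all follow-up groups."""
    return sum(persons * years for persons, years in groups)


# (number of persons, years each was followed), from the example above.
groups = [(250, 0.5), (750, 1.0)]
person_years = total_person_time(groups)
print(f"Total person-time = {person_years:g} person-years")
```

Option 1 in the text would discard the 250 persons entirely; counting their half-years keeps their information in the denominator.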


As you can see, the person-time of observation can be used to account for the varying time periods of follow up. Using this, we could now calculate another type of incidence called ‘incidence density’.


$$
\text{Incidence Density} = \frac{\text{No. of new cases of a disease during a given period}}{\text{Total person-time of observations}} \times K
$$

K = multiple of 10


Let’s look at the example given in Figure 2. 


Figure 2 shows that different subjects A - E have been followed up for different periods of time.

[Figure 2: follow-up of subjects A - E on a time axis from Jan '96 to Jan 2000, marked at half-yearly intervals; lines show the time followed and a dot marks the onset of the disease.]

For example, subject A was followed up for 3 years.
Total number of person-years = A - 3; B - 4; C - 3; D - 1; E - 3 = 14 person-years
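
The Figure 2 total and the incidence density formula can be combined in a short sketch. Two points here are assumptions: the follow-up of subject D is read as 1 year because the five values must add up to the stated 14 person-years, and the number of new cases is HYPOTHETICAL, since it is not stated in the text.

```python
def incidence_density(new_cases: int, person_time: float, k: int = 100) -> float:
    """New cases per k units of person-time."""
    if person_time <= 0:
        raise ValueError("person-time must be positive")
    return new_cases / person_time * k


# Years of follow-up per subject, as listed for Figure 2
# (D = 1 is inferred from the stated total of 14 person-years).
years_followed = {"A": 3, "B": 4, "C": 3, "D": 1, "E": 3}
total_person_years = sum(years_followed.values())

# HYPOTHETICAL value, used purely for illustration.
hypothetical_new_cases = 2

density = incidence_density(hypothetical_new_cases, total_person_years)
print(f"Total person-time = {total_person_years} person-years")
print(f"Incidence density = {density:.1f} per 100 person-years (illustrative)")
```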
The most commonly used incidence rates are:

1. Morbidity Rate:
Is the incidence rate of non-fatal cases in a total population at risk during a specified period of time.

$$
\text{Morbidity rate} = \frac{\text{No. of new cases of non-fatal disease}}{\text{Population-at-risk during a specified period of time}} \times K
$$

K = multiple of 10

2. Mortality Rate:
Is the incidence rate of fatal cases (deaths) in a total population during a specified period of time.

$$
\text{Mortality rate} = \frac{\text{No. of deaths from a disease}}{\text{Total population during a specified period of time}} \times K
$$
3. Case Fatality Rate (CFR):

CFR measures the number of deaths that occur from a specific illness in a group of patients suffering from that illness in a given period of time. It represents the proportion of fatal cases among affected individuals and is often expressed as a percentage.

$$
\text{Case Fatality Rate (CFR)} = \frac{\text{No. of deaths from a disease in a given period of time}}{\text{No. of diagnosed cases of that disease in that period}} \times 100
$$
Case fatality rate is an indication of the severity of an illness. It is different from the cause-specific mortality rate, which expresses the number of deaths from a particular disease among all individuals in a population during a given time period.

In 1995 in Sri Lanka, the total number of cases admitted to Government hospitals with poisoning by medicinal agents was 2590 and deaths due to the same cause were 47.

Case fatality rate for poisoning by medicinal agents = 47 / 2590 x 100 = 1.8%

i.e. 18 deaths per 1000 cases of poisoning by medicinal agents in Sri Lanka in 1995
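
The same figures can be checked with a small function. This is a hedged sketch; `case_fatality_rate` is an illustrative name, and the constant is exposed as a parameter so that both the percentage and the per-1000 form can be shown.

```python
def case_fatality_rate(deaths: int, diagnosed_cases: int, k: int = 100) -> float:
    """Deaths per k diagnosed cases of the same disease in the same period."""
    if diagnosed_cases <= 0:
        raise ValueError("number of diagnosed cases must be positive")
    return deaths / diagnosed_cases * k


# Poisoning by medicinal agents, Government hospitals, Sri Lanka, 1995.
deaths, cases = 47, 2590
print(f"CFR = {case_fatality_rate(deaths, cases):.1f}%")
print(f"CFR = {case_fatality_rate(deaths, cases, k=1000):.0f} deaths per 1000 cases")
```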

It should be noted that the deaths that make up the numerator in CFR do not necessarily represent the cases that make up the denominator.
Can you suggest an example to support this statement?
4. Attack rate (AR):

AR is a measure of occurrence of new cases of a disease during a short period of time, as in an outbreak of a disease.

$$
\text{Attack Rate (AR)} = \frac{\text{No. of new cases of a disease during a short period of time}}{\text{Population-at-risk}}
$$
Exercise 1

Q1. An investigation of an outbreak of gastro-intestinal illness in Laswego county in New York revealed that 37 out of 60 people who had attended a dinner became ill within a few hours. Given below is an epidemiological analysis of this outbreak.

Table 1. Characteristics of persons in Laswego county during an outbreak of acute gastroenteritis
1.1 Calculate the total attack rate.

1.2 Calculate the attack rate by sex.

1.3 Calculate the attack rate among those who ate and those who did not eat each food item.

2.1 Calculate the morbidity, mortality and case fatality rates.
2.2 Comment on the trends of shigellosis.
Q3. Figures 3 and 4 represent the follow up of 7 subjects with exposure to a risk factor and the time taken for the development of a disease.

Figure 3. Outcome of a follow up of 7 exposed persons in population A

Figure 4. Outcome of a follow up of 7 exposed persons in population B
3.1 How many new cases occurred in populations A and B?

3.2 Calculate the person-time of exposure for:
Population A
Population B

3.3 Calculate the incidence density (incidence rate) for:
Population A
Population B

3.4 Comment on the findings.
Q4. In a 5 year follow up study of post menopausal hormonal use and coronary heart disease (CHD), 90 cases with CHD were diagnosed among 32,317 post menopausal women on Hormonal Replacement Therapy (HRT) over a total of 105,786.2 person years of follow up.

4.1 Calculate the incidence rate and incidence density among the participants in this study.

Pl. note: It is assumed that the risk of CHD in the group remains constant over time. This may not be true since individuals have varying degrees of risk. The ways of overcoming this problem will be dealt with later.
3. Epidemiological study designs

Epidemiological study designs are of two types. They are:
* observational studies
* experimental studies

3.1 Observational studies

Observational studies are those in which data are gathered simply by observing events as they happen, without the investigator playing an active role in what is taking place.

Observational studies allow nature to take its course; the investigator observes and measures but does not intervene.

There are two types of observational studies:
3.1.1 Descriptive studies
3.1.2 Analytical studies









3.1.1 Descriptive studies

Descriptive studies focus on describing the occurrence of a disease or health related event in a population in terms of person, place and time.





3.2 Experimental studies 


In experimental studies, the investigator ‘intervenes’ (makes a change) in one or more variables in a group and ‘does not intervene’ in another group. This ‘change’ introduced by the investigator may vary.

Examples: introducing a new drug, introducing a training programme. Here, the study attempts to find out whether the intervention had an influence on a defined outcome.

Therefore, experimental studies are also called intervention studies.


Given below is a summary of what we have discussed up to now. 
[Summary diagram of epidemiological study designs; labels recovered: case report, correlational, cross-sectional, case control.]
3.1 Observational studies

3.1.1 Descriptive studies
As the name suggests, these studies describe the amount and the distribution of disease patterns in populations. To describe the occurrence of disease fully, some questions have to be answered:

* who (person) gets the disease?
* when (time) do cases occur?
* where (place) do cases occur?

In other words, the description has to tell us the person, place and time in respect of a disease occurrence.





Person: Characteristics of a person such as age, sex, religion, marital status, socio-economic factors, etc. can furnish different types of clues about the pattern and possible etiology of a disease.

Let us work on Exercise 1 that is given below to illustrate the above statement.

Tables 1-5 below are from a study carried out in the Gampaha district to describe the prevalence of hepatitis B infection. Evidence of infection was determined by testing for the presence of hepatitis B surface antigen (HBsAg) in samples of blood obtained from a random sample of 1913 persons living in the district at the time of survey.


Table 1. Distribution of HBsAg by gender

Table 2. Distribution of HBsAg status by age

Age in years    HBsAg positive    HBsAg negative
0-4
5-9
10-14
15-19
20-29
30-39
40-49
50 and over
Total








Table 4. Distribution of HBsAg status by family characteristics

Family characteristics           HBsAg positive    HBsAg negative    Total
Number of persons 5 or below                       915               930
Number of persons more than 5                                        983
Number of children 0-2                                               521
Number of children 3 or more                       441               481

Table 5. Distribution of HBsAg status by socio-economic status

Socio-economic status    HBsAg positive    HBsAg negative    Total
High                                       336               340
Middle                                                       1317
Low                                        243               256
Total                    48                1865              1913
Place: Frequency of disease can be related to place of occurrence in terms of areas described by natural boundaries such as rivers, mountains or deserts, or by political boundaries such as districts or Grama-Niladhari divisions. Characteristics of the physical and biological environment of an area may cause certain diseases to be more common in that area.

Comparison of disease frequency in relation to place can also be made between countries or between regions within a single country. For example, mortality from colon cancer is much lower in Japan than in USA (Ref. Doll R and Peto R. The causes of cancer. New York: Oxford University Press, 1981).

Think of possible reasons for this difference.
Exercise 2

Table 6 is reproduced from a report on the national nutritional survey carried out in 1988/1989 in Sri Lanka.

Table 6. Nutritional status by districts in SL during 1988/89





Colombo 
Gampaha 0.7 
Kalutara 1.2 
Kandy 4.6 
Matale . 5.4 
Nuwara-Eliya cA. 
Galle 3.6 
_ Matara 7 2.8 
Kurunegala 0.8 
Puttalam ! 0.4— 
Anuradahapura 4.4 
Polonnaruwa 2.9 
Badulla 3.8 
Moneragala je: 
Ratnapura . 22 
Kegalle 3.9 
5 
Source: Report on the National Nutritional Status Survey in SL, 1988/89

2.1 Describe the district differences in nutritional status (Use extra paper).



Time: Study of disease occurrence by time is a basic aspect of epidemiology. Occurrence is usually expressed on a monthly or annual basis. Three kinds of change with time are described.

Secular trends: This implies changes in occurrence over a long period of time, generally over several years. The change in the crude birth and death rates of Sri Lanka from 1945 to 2003 (as shown in Figure 1) depicts a secular trend.

Figure 1. Secular trends in crude birth and death rates of Sri Lanka (1945 to 2003)
Within these secular trends, periodic fluctuations may be seen. These could be on an annual or on a seasonal basis and are called cyclical changes. A classic example is given in Figure 2.

Figure 2. Distribution of cases of Japanese Encephalitis by months during 1995
Short term fluctuations, as seen in an epidemic, are another type of variation of disease frequencies with time. This pattern is called an epidemic curve. Two examples are given below in Figures 3 and 4.

Figure 3. Distribution of Dengue Haemorrhagic Fever by months during 1995

Figure 4. Epidemic curve for an outbreak of Food Poisoning
Descriptive studies use data from many diverse sources. Can you suggest some common sources of data for descriptive studies?

Several approaches are used to obtain data for descriptive studies. They are,

a. Case reports and case series
b. Correlational studies
c. Cross sectional surveys
a. Case reports and case series
A case report is the most basic type of descriptive study, consisting of a carefully detailed report of a single patient. A collection of individual case reports of new or unusual occurrences gives rise to a case series, which describes the characteristics of a number of patients with a given disease. These help to identify unusual clinical presentations of a disease and may lead to formulation of a hypothesis about a possible cause.

Routine surveillance programs often use accumulating case reports to suggest the emergence of new diseases or epidemics. Let’s look at the following classic example:





Five young previously healthy homosexual men were admitted to hospital with a diagnosis of Pneumocystis carinii pneumonia in three hospitals in Los Angeles, during a six month period from 1980-1981.

This clustering of cases was striking because previous to this, infection with Pneumocystis carinii had been reported exclusively among older men and women whose immune systems were suppressed. This suggested that these young men were suffering from a previously unknown disease which causes immunosuppression. This was later called Acquired Immune Deficiency Syndrome (AIDS). The fact that all cases were homosexual men also raised the hypothesis that some aspects of sexual behaviour could be related to the risk of acquiring this disease.
b. Correlational studies
Unlike in case series, data from population groups are used to compare disease frequencies. Comparisons are made using the same population at different periods of time or different groups of populations during the same period of time.

Fig 5 refers to the correlation between per capita meat consumption and colon cancer among women in various countries.

Figure 5. Correlation between per capita meat consumption and colon cancer among women in different countries

What associations can you derive from this data?
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Correlational studies are used as the first step in investigating a possible relationship 
between ‘ exposure’ and ‘:lisease’ and formulating hypotheses. Other advantages are that 
they can be done quickly and inexpensively, often using already available data. 


In correlational studies such as the one in Figure 5, ‘person’ refers to population groups and not to individuals. These studies are therefore not used to test hypotheses. Another limitation of correlational studies is that factors other than the exposure under study may account for the differences in disease frequency.
c. Cross sectional surveys
The third type of descriptive epidemiological design is the cross sectional survey. These studies are carried out at a single examination in a cross section of the population. They provide information about the frequency and characteristics of a disease by providing a ‘snapshot’ of the experience of a population at a specified period or point in time.

This information can be of great value to public health administrators in assessing the health status and health care needs of a population. Please refer to Exercises 1 and 2 in Chapter 3 for such cross sectional surveys.
In conducting a cross sectional study, one must:
- have a clear objective
- define the study population
- give due consideration to using a sampling method
- ensure adequate response rates
- identify methods of data collection
- carry out appropriate analysis

i. What is the measure of disease frequency that could be calculated in a cross sectional study?
Prevalence

ii. What kind of cases (incident or prevalent) would a cross sectional survey identify?
Prevalent cases
Exercise 4

Q1. It has been observed in cross sectional studies that individuals with cancer have significantly lower levels of beta carotene than healthy individuals of the same age and sex.

Suggest three possible explanations for this observation.
It may be a protective factor for cancer.
Q2. Table 7 shows results of a cross sectional survey of coronary heart disease (CHD) among male farm workers aged 40-70 years by their occupational physical activity.

Table 7. Distribution of coronary heart disease (CHD) among male farm workers aged 40-70 years by their occupational physical activity

Level of physical activity    No. examined    No. with CHD    Prevalence rate
2.1 Complete column 4 of the table.

2.2 Describe and comment on the findings.

2.3 Suggest why age-adjusted prevalence rates have been calculated.
Q3. Given below are the findings of a cross sectional survey carried out among a sample of 600 flat dwellers within the age group 35-45 years.

Level of physical     Diabetes mellitus*      Total
activity              Absent     Present
Low                   305        20           325
Medium                171        19           190
High                   65        20            85
Total                 541        59           600
* Based on defined criteria

3.1 Describe the findings.
People having high physical activity have a higher prevalence of diabetes.

3.2 What conclusions could you draw?
Diabetes mellitus is higher in people with high physical activity.
People with diabetes mellitus may engage in more physical activity.

3.3 What additional information would you like to have?
Activity level before the diagnosis of diabetes mellitus.
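The prevalence figures behind the answers to Q3 can be checked with a short calculation. The sketch below is only an illustration, assuming the counts in the Q3 table (diabetes present and total examined at each activity level); the names `survey` and `prevalence_percent` are made up for this example.

```python
# Counts assumed from the Q3 table: (diabetes present, total examined) per activity level.
survey = {"Low": (20, 325), "Medium": (19, 190), "High": (20, 85)}


def prevalence_percent(cases, examined):
    """Prevalence = existing cases / persons examined, expressed per 100 persons."""
    return 100.0 * cases / examined


prevalence = {level: prevalence_percent(c, n) for level, (c, n) in survey.items()}

for level, value in prevalence.items():
    print(f"{level:<7}{value:5.1f}%")
```

Prevalence rises from the low to the high activity group, but because exposure and disease were measured at the same time, the calculation alone cannot show which came first.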
4. Introduction to analytical studies

Up to now, we have discussed the following epidemiological study designs under observational studies:

4.1.1 Descriptive studies
a. Case reports and case series
b. Correlational studies
c. Cross sectional studies

As you have learnt, descriptive studies are very useful to describe patterns of disease occurrence with respect to person, place and time and thereby formulate hypotheses about a disease and its risk factors.


What is a hypothesis?

It is a statement of belief or intention, which one expects to prove or disprove. This statement relates to certain factor/s which cause/s or relate/s to the occurrence of a disease.


Refer to Exercise 1 in Chapter 3 (pages 26-28). In this cross-sectional survey that has been carried out to describe the prevalence and characteristics of hepatitis B infection, what hypotheses are you able to formulate?
Hint: refer to your answer given in 1.3 in this Exercise

In the same manner, you can also formulate hypotheses for the questions in Exercise 4 in the same chapter (pages 37-38).
Let us now take a look at the conclusions that we drew from these cross-sectional surveys.

For example, in Q3 in Exercise 4, conclusions were:

• High physical activity levels seem to be associated with the prevalence of diabetes among flat dwellers

Or

• A diagnosis of diabetes may have influenced the flat dwellers to change their behaviour from being less physically active to being more physically active

As you can see, we cannot definitely come to a conclusion that low physical activity is associated with diabetes. This is a weakness in all cross-sectional study designs as they measure both the exposure or risk factor and the disease status at the same time in each individual. This makes it difficult to determine whether the exposure came before the disease or after the disease.





It is not possible to establish causal or temporal relationships
from data collected in a cross sectional time frame.


If we are to establish such relationships about diseases, we need to have a study design with comparison groups either for the exposure or the outcome and work backwards or forwards. This is one of the key features in analytical studies. In the following chapters in this volume, we will study analytical studies - the second type of study designs that come under observational studies.
4.1.2 Analytical studies

In analytical studies, the investigator begins with a hypothesis and designs the study to specifically test that hypothesis. This is different to a descriptive study, which does not begin with a hypothesis but whose results are used for formulating a hypothesis. The term ‘analytical’ implies that the study is designed to establish the ‘cause’ of a disease by looking for associations between disease occurrence and its exposure to a risk factor.

The basic approach in analytical studies is to test a specific hypothesis
formulated based on the findings of a cross-sectional study.
Another feature in an analytical study is that the subject of interest is the individual in the population. This is different to descriptive studies where the population or a representative sample of that population is the focus of interest. However, as in descriptive studies, inferences are made to the population from which the individuals are drawn, and not only to the individuals who are actually studied.


It is important to note that although analytical studies help in testing a hypothesis about associations between disease occurrence and its exposure to risk factors, this alone will in no way imply ‘causality’ of a disease. Such a judgement of causality can only be made by taking into account all available evidence according to a set of criteria. These criteria that assist in judging causality of a disease include: strength of the association, biological credibility of the hypothesis, consistency of the findings, temporal sequence and the presence of a dose-response relationship.


Analytical studies have the following features:

(a) They are used to test hypotheses
(b) There is always a comparison group
    The basis for identifying the two groups is:
    - presence/absence of exposure
    - presence/absence of disease

There are three types of analytical studies. They are:
a. Cohort study
b. Case-control study
c. Cross sectional design in analytical study
5. Cohort studies

5.1 Types of cohort studies

A major type of analytical study design is the cohort study. This design is also called a follow up or incidence study.








‘Cohort’ was the Roman term for a group of soldiers that marched together. In clinical research, a cohort is a group of people grouped together with a common cause or a group of persons with a common characteristic or experience. For example, a group of persons born during the same year makes a birth cohort.

A cohort is a group of persons
who share a common characteristic or an experience.





Let us now consider a group of people who are free of a particular ‘disease’ under study. This group can be further divided into two groups based on their exposure to a ‘potential cause’ of this disease, i.e.

• Group with the exposure to a potential cause
• Group without the exposure to a potential cause

In epidemiology, we refer to this particular ‘disease’ as the outcome and the ‘potential cause’ as the risk factor or exposure. Both exposed and non-exposed groups are then followed up to see whether individuals in each group would develop the outcome. This sequence of events makes up a ‘cohort study’ and is illustrated in Figure 1.


Figure 1. Sequence of events in a cohort study design

[Figure: after excluding those with the ‘disease’, persons without the disease under study are divided into a group exposed to a potential risk factor and a group not exposed.]
Since cohort studies are interested in exposures preceding outcomes, these establish the time sequence of events and thereby strengthen the inferences made about exposures as ‘potential causes’ of disease outcomes.

There are two types of cohort studies. Of the two, one is called the prospective cohort study, in which the investigator is looking forwards from potential cause (exposure) to possible disease (outcome).

Given below is a classic example of a prospective cohort study.


Prospective Cohort Study

The Nurses’ Health Study examined the incidence and risk factors for common diseases in women. The basic steps in performing the study were to

1. Assemble the cohort. In 1976 the investigators obtained lists of registered nurses aged 25 to 42 in the 11 most populous states and mailed them an invitation to participate in the study.

2. Measure exposures (potential risk factors). They mailed a questionnaire to obtain information about their diet and received completed questionnaires from 121,700 nurses. They sent questionnaires every 2 years for the next two decades and updated the status of their diet measured at baseline.

3. Follow up the cohort and measure outcomes. The periodic questionnaires also included questions about the occurrence of a variety of disease outcomes, which were then confirmed by review of medical records.

The prospective approach allowed investigators to measure exposures on diet completely and accurately at baseline. The cohort design allowed them to collect data on subsequent outcomes. The large size of the cohort and extended period of follow up have provided an unparalleled opportunity to study risk factors for various forms of heart disease, cancer, and other diseases. For example, the investigators examined the hypothesis that high intake of dietary fibre is associated with a decreased risk of colorectal cancer. Fibre intake was measured by questionnaire, and 787 cases of colon cancer were confirmed between 1980 and 1994. The rate of colon cancer in women in the lowest decile of dietary fibre intake was similar to the rate in women in the highest decile. Adjusting for potential confounding factors did not change the result. The large number of cases of colon cancer and the quality of the methods support the conclusion that high intake of dietary fibre does not protect against colon cancer.

Ref: Fuchs CS, Giovannucci EL, Colditz GA, et al. Dietary fiber and the risk of colorectal cancer and adenoma in women. N Engl J Med 1999;340:169-76.
1. What were the exposed and non-exposed groups and the disease outcome/s that were studied by the researchers in this study?

3. Can you think of any difficulties the researchers faced in carrying out this study?
There is another type of cohort study, in which the investigator looks backwards in time, using information that has already been recorded, to trace events from potential cause (exposure) to possible disease (outcome). These are called retrospective cohort studies.

Given below is a classic example of a retrospective cohort study.





Retrospective Cohort Study

To describe the effect of asbestos dust exposure on cancer mortality, Enterline analyzed data in a retrospective cohort study among asbestos workers. The basic steps in performing the study were to

1. Identify a suitable cohort. The investigators used the social security tax returns filed with the United States Bureau of Internal Revenue during 1948 to 1951 to identify asbestos workers who retired normally at age 65; those who retired before age 65 for personal reasons but lived to age 65; and men who retired before age 65 because of a disability but who lived to be 65.

2. Collect data about exposures. There was a total of 1,376 men in this study who reached age 65. Of them, complete exposure and job histories were available for 1,348. Dust levels at each job site and time period of exposure were expressed as million particles per cubic foot of air (mppcf).

3. Collect data about subsequent outcomes that occurred at a later time. They collected data on 58 cancer deaths occurring in this group between 1948 and 1963 from claims filed with the Social Security Administration and the corresponding death certificates obtained from state health departments.

The observed mortality among these men was compared with the expected mortality of the entire US white male population living in the time-age intervals that characterized the retired population. The investigators found that men who retired from an industry with asbestos dust exposure had an overall mortality rate 14.7% higher than all US males. This excess was due almost entirely to cancer and respiratory diseases. For cancer, the greatest excess was in respiratory system cancers. For respiratory diseases, the excess was entirely due to pneumoconiosis and pulmonary fibrosis.

Ref: Enterline PE. Mortality among asbestos product workers in the United States. Ann N Y Acad Sci.
1. What were the exposed and non-exposed groups and the disease outcome/s studied by the researchers in this study?

2. How did the researchers ensure that the exposures preceded the outcomes?

3. If you re-design the above study into a prospective cohort study, can you think of any advantages and disadvantages that you may have?
Advantages:
Disadvantages:
It should be clear to you that in both types of cohort studies, the study begins with exposed and non-exposed groups. However, in retrospective cohort studies, all relevant events (both exposure to potential risk factor and disease outcome) have already occurred when the study is initiated. In contrast, in prospective studies, the exposures have occurred at the time the study is begun but the disease outcomes have certainly not yet occurred and therefore, the participants need to be followed up into the future to assess the incidence rates of the disease outcome.







The terms ‘prospective’ and ‘retrospective’
are used only in relation to the timing of data collection and
not to refer to a particular study design type.
a. Identify a suitable cohort
For a common exposure such as cigarette smoking and betel chewing, a large number of exposed persons could be identified from the general population (community cohort). For rare exposures such as certain occupations (e.g. asbestos or cinnamon work), medical therapy or procedures (e.g. chemotherapy, x-ray treatment) or environmental hazards (e.g. high tension wires, the atomic bomb in Hiroshima), it is necessary to identify a group who have undergone that specific exposure or experience. Advantages of such cohorts include:
- It allows collection of a sufficient number of exposed persons within a short time
- It leads to identification of aetiological agents in special circumstances
- It allows more complete and accurate information on exposures and good compliance at follow up. For example, Doll and Hill (1950) utilised British doctors as the cohort for studying smoking and lung cancer because of their ability in providing accurate information about their smoking habits.


b. Select an appropriate comparison group
The choice of a non-exposed group is crucial and difficult. Ideally, the non-exposed group needs to be as similar as possible to the exposed group with respect to all other factors that are related to the disease except the exposure under study.

In a single general cohort, an internal comparison group can be utilised. For example, the experience of the cohort members classified as having an exposure is compared with that of members of the same cohort who are either non-exposed or exposed to different degrees. In contrast, in some occupational settings, there might not be a comparison group that could be definitely identified as non-exposed. In this instance, an external comparison group could be utilised. For example, in a study assessing the effect of an occupational dust on respiratory diseases, an appropriate comparison group for the cohort of exposed workers was selected from hospital labourers. This study was further strengthened by including multiple comparison groups from different occupational settings (e.g. textile workers, rubber tappers). This is very useful especially when no single group is comparable with the exposed group.
c. Collect data on exposure and subsequent outcomes
Data on basic characteristics of the cohort, exposure and subsequent disease outcomes need to be collected. Information on exposure status could be obtained from a number of sources. Some of these include: surveys with follow up procedures/interviews, medical and employment records monitored over time and periodic medical examinations and interviews.

The use of pre-existing records offers a number of advantages as well as disadvantages.

State one advantage and disadvantage in using pre-existing records.
For a cohort study with fatal endpoints, outcome data can be obtained from death certificates. It is the most reliable method for assessing all-cause mortality as the outcome in a study but not so much for mortality from a specific disease. For non-fatal end points, outcome data can be obtained from physician’s records, BHT, hospital registers, etc. Additional information such as pathology reports and hospital records can be used to confirm the diagnosis.

Whatever method is used for identifying disease outcomes,
it should be equally applied to both exposed and non-exposed groups.


d. Approaches to follow up
Whether prospective or retrospective, in any cohort study, the collection of outcome data involves tracing or following up the cohort members from the point of exposure to the point of disease occurrence. This poses the biggest challenge in conducting a cohort study. It is also the main reason for its high cost in terms of money and time. In general, the longer the duration of follow up, the more difficult it will be to achieve complete follow up because the cohort members are more likely to move, to change jobs, to change names, etc.


In a cohort study, disease outcome data is collected from both exposed and non-exposed groups via interviews, questionnaires and examination of records. Since this information is very crucial in drawing conclusions about associations, we need to ensure that we do not introduce any bias during data collection.

What is bias?
It is an error made in epidemiological studies due to known sources of variation resulting in an incorrect estimation of the association between an exposure and the risk of disease. Usually, this error will distort study findings in one direction.
Biases are mainly of three types, due to differences in the way:
- study subjects are selected into a study (selection bias)
- information is reported by the study subjects (recall bias) or obtained or interpreted by the researchers (interviewer bias)
- participants are lost to follow up (loss to follow up)


In a cohort study, bias can be introduced mainly due to loss to follow up. Since both exposure and outcome have occurred at the beginning of a study, recall bias could influence the classification of exposures in retrospective cohort studies. Since exposure is measured prior to the disease outcome, it is unlikely that such bias could influence the classification of exposures in prospective cohort studies. However, interviewer bias could be introduced while measuring disease outcomes in both types of cohort studies.


e. Analysis and interpretation
Once the data is collected, the relationship between the exposure variable and outcome can be presented in a two by two table, as shown below.

Table 1. Relationship between the exposure and outcome in a cohort study

                  Disease present    Disease absent    Total
Exposed                 a                  b            a+b
Non-exposed             c                  d            c+d
Total                  a+c                b+d           N = a+b+c+d
Using information in Figure 2,

1.2 Write the hypothesis for the association that needs to be tested in this cohort study.

1.3 Test the given hypothesis.

What you have done in 1.3 is testing a hypothesis for a cohort study using a statistical approach. Let us now consider testing the same hypothesis using an epidemiological approach.

We make use of incidence, a measure of disease frequency, to assess the effect of an exposure on a disease. This is done by calculating the ratio of the incidence rates of a disease in the exposed and the non-exposed groups. This ratio is called the relative risk (RR) or risk ratio.

If the incidence of disease in the exposed group is denoted by Ie and the incidence of disease in the non-exposed group by Io, this is how we can write the RR:

Relative risk (RR) = Ie / Io
• Calculation of RR

i. For a cohort study with count data, the relative risk is calculated as the ratio of the cumulative incidence of a disease in the exposed and the non-exposed groups.

Relative risk (RR) = Cumulative incidence of disease in the exposed group (CIe) / Cumulative incidence of disease in the non-exposed group (CIo)

Complete the following equation for relative risk using symbols given in Table 1.

RR = [a / (......)] / [c / (......)]

1.4 Calculate the RR for Q1 in Exercise 1.
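As a rough illustration of the count-data calculation, the sketch below computes the cumulative incidence in each group and their ratio. It assumes the usual two by two layout (a, b = exposed with and without disease; c, d = non-exposed with and without disease); the counts used are hypothetical and are not the data of Exercise 1.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a two by two table of counts.

    a, b: exposed persons with / without the disease
    c, d: non-exposed persons with / without the disease
    """
    ci_exposed = a / (a + b)        # cumulative incidence in the exposed
    ci_non_exposed = c / (c + d)    # cumulative incidence in the non-exposed
    return ci_exposed / ci_non_exposed


# Hypothetical cohort: 30 of 100 exposed and 15 of 100 non-exposed develop the disease.
rr = relative_risk(a=30, b=70, c=15, d=85)
print(f"RR = {rr:.2f}")
```

With these made-up counts the exposed group has twice the cumulative incidence of the non-exposed group.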
ii. Let us consider a cohort study with person-time follow-up data, as shown in Table 2.

Table 2. Presentation of data from a cohort study with person-time data

Please refer to volume 1 pages 13-14 to understand person-time data.

For a cohort study with person-time data, the relative risk is calculated as the ratio of the incidence density of a disease in the exposed group and the non-exposed group.

Relative risk (RR) = Incidence density of disease in the exposed group (IDe) / Incidence density of disease in the non-exposed group (IDo)

Complete the following equation for relative risk using symbols given in Table 2.





Q2. A cohort study was carried out among postmenopausal female nurses to investigate the relationship between postmenopausal hormone use and coronary heart disease (CHD). After a total of 54,308.7 person-years of follow-up, 30 women who reported use of postmenopausal hormones developed CHD. Among the non-hormone users, 60 developed CHD after 51,477.5 person-years of follow-up.

2.1 Write the 2x2 table for the above data.
Postmenopausal hormone use    CHD cases    Person-years
Yes                               30          54,308.7
No                                60          51,477.5
Total                             90         105,786.2


2.2 Calculate the RR
RR = (30 / 54,308.7) / (60 / 51,477.5) = 0.47, or about 0.5
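The person-time calculation for Q2 can be sketched as follows, using only the figures quoted in the question. The function name `incidence_density` is chosen for this illustration.

```python
def incidence_density(cases, person_years):
    """Incidence density = new cases / total person-time at risk."""
    return cases / person_years


# Figures quoted in Q2: hormone users versus non-users.
id_exposed = incidence_density(30, 54308.7)
id_non_exposed = incidence_density(60, 51477.5)
rr = id_exposed / id_non_exposed

print(f"ID (hormone users) = {id_exposed * 1000:.2f} per 1,000 person-years")
print(f"ID (non-users)     = {id_non_exposed * 1000:.2f} per 1,000 person-years")
print(f"RR                 = {rr:.2f}")
```

The ratio comes to a little under 0.5, which is the value of 0.5 used when this result is interpreted.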


Interpretation of relative risk

Relative risk estimates the magnitude or strength of an association between an exposure and a disease. It indicates the likelihood of developing a disease in a group of people exposed to a potential risk factor relative to the non-exposed group and therefore helps in assessing the likelihood of an association representing a causal relationship.

RR is an indicator of the strength of an association
between an exposure and a disease.
Let us now see how the numerical value of RR could be interpreted.

• A relative risk of 1.0 indicates that the incidence rates of disease in the exposed and non-exposed are identical. In other words, the likelihood of developing a disease in a group of people exposed to a risk factor relative to a group not exposed to it is equal.

If the relative risk is 1,
there is no association between the exposure and the disease outcome.

• A relative risk greater than 1 indicates that the incidence rate of disease in the exposed is higher than that of the non-exposed group. In other words, the likelihood of developing a disease in a group of people exposed to a risk factor is higher than that in a group not exposed to it.

If the relative risk is > 1,
there is a positive association or an increased risk of disease
among those exposed to a risk factor compared to the non-exposed.
Using this knowledge,
1.5 How do you interpret the RR that you obtained for Q1 in Exercise 1?

We say that OC users had 1.4 times the risk of developing bacteriuria compared to non-OC users, or OC users were 1.4 times more likely to develop bacteriuria compared to non-OC users.

We have obtained a relative risk of 0.5 in the study given in Q2 in Exercise 1. It indicates that the women who used post-menopausal hormones had only half or 0.5 times the risk of developing CHD compared with non-users of post-menopausal hormones, or women who used post-menopausal hormones were only 0.5 times as likely to develop CHD compared to non-hormone users.


5.4 Attributable Risk

Another way of testing a hypothesis in a cohort study using an epidemiological approach is by comparing the difference in disease occurrence between the exposed and non-exposed groups. This difference is called the attributable risk (AR), excess risk or absolute risk. It is the difference in the incidence rates of disease between the exposed and non-exposed groups.

This is how we write the AR.

Attributable risk (AR) = Incidence in the exposed (Ie) - Incidence in the non-exposed (Io)
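A minimal sketch of this subtraction for count data is given below. It assumes the usual two by two layout (a, b = exposed with and without disease; c, d = non-exposed with and without disease), and the counts are hypothetical.

```python
def attributable_risk(a, b, c, d):
    """Attributable risk = incidence in the exposed - incidence in the non-exposed.

    a, b: exposed persons with / without the disease
    c, d: non-exposed persons with / without the disease
    """
    incidence_exposed = a / (a + b)
    incidence_non_exposed = c / (c + d)
    return incidence_exposed - incidence_non_exposed


# Hypothetical cohort: 30 of 100 exposed and 15 of 100 non-exposed develop the disease.
ar = attributable_risk(a=30, b=70, c=15, d=85)
print(f"AR = {ar * 1000:.0f} per 1,000 persons")
```

Unlike the relative risk, the result keeps the units of the incidence measure (here, cases per 1,000 persons over the follow-up period).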
• Interpretation of attributable risk

Attributable risk estimates the absolute effect of an exposure or the excess risk of a disease in the exposed group compared to the non-exposed group. The AR is therefore used to quantify the risk of disease in the exposed group that is attributable to the exposure itself, after excluding the risks due to all other potential causes other than the exposure under study that could have led to the disease outcome.

AR is an indicator of the risk of a particular disease
attributable to the exposure itself.

It is noteworthy that the interpretation of AR is valid only if a cause-effect relationship exists between the exposure and disease under study.


Let us now see how the numerical value of AR could be interpreted.

An attributable risk of 0 indicates that there is no difference in the incidence rates of disease between the exposed and the non-exposed groups.

If the attributable risk is 0,
there is no association between the exposure and the disease outcome.

If there is a causal association between the exposure and the disease, an attributable risk greater than 0 indicates the incidence rate of a disease in the exposed that can be attributed to the exposure itself. Alternatively, it indicates the incidence rate of a disease in the exposed that could be eliminated if the exposure was removed.

If the attributable risk is > 0,
there is an excess risk of disease among the exposed
that can be attributed to that exposure.
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e What is the importince of RR and AR for a medical person? 
* RR and AR are useful in measuring the associations between exposures and disease outcomes and thereby assessing the risk of developing a disease in a population.

RR and AR are called ‘measures of association’.

It is important to note that although both RR and AR are measures of association, these provide very different information about the risk factors of diseases:

* RR provides information about the strength of an association between an exposure and a disease outcome.
* Assuming that there is a causal relationship between an exposure (risk factor) and the outcome (disease), AR provides information about the extent of the public health impact if the exposure is removed from the population at risk.


In the classic study that was carried out among British male physicians to assess the 
relationship of cigarette smoking (exposure) with lung cancer and CHD deaths 
(outcomes), the following results were obtained: 
For lung cancer: RR = 14; AR = 130/10⁵/year
For CHD: RR = 1.6; AR = 256/10⁵/year





Ref: Doll R and Peto AB. A study on the aetiology of carcinoma of the lung, 1952. Br Med J. 2:127

Accordingly, we can conclude that cigarette smoking is a much stronger risk factor for dying of lung cancer compared to CHD. However, if smoking is causally related to both diseases, prevention of cigarette smoking in the population would prevent far more deaths from CHD than from lung cancer among the smokers in the population. Therefore, the public health impact of preventing smoking in the community will be far greater for CHD than for lung cancer.
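The contrast between RR and AR can be checked with a few lines of arithmetic. The sketch below is only illustrative: the death rates per 100,000 per year are assumed inputs, chosen so that they reproduce the RR and AR quoted above, and the function names are hypothetical.

```python
# Illustrative death rates per 100,000 persons per year.
# ASSUMPTION: these rates are hypothetical inputs chosen so that they
# reproduce the RR and AR quoted in the text; the text does not give them.
RATES = {
    "lung cancer": {"exposed": 140.0, "unexposed": 10.0},
    "CHD": {"exposed": 669.0, "unexposed": 413.0},
}


def relative_risk(rate_exposed, rate_unexposed):
    """Ratio of the rate in the exposed to the rate in the non-exposed."""
    return rate_exposed / rate_unexposed


def attributable_risk(rate_exposed, rate_unexposed):
    """Difference between the rate in the exposed and in the non-exposed."""
    return rate_exposed - rate_unexposed


for disease, rate in RATES.items():
    rr = relative_risk(rate["exposed"], rate["unexposed"])
    ar = attributable_risk(rate["exposed"], rate["unexposed"])
    print(f"{disease}: RR = {rr:.1f}, AR = {ar:.0f} per 100,000 per year")
```

With these assumed rates, the ratio is much larger for lung cancer while the difference is larger for CHD, which is the pattern described in the text.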
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5.5 Advantages and disadvantages of cohort studies 


Advantages:

* Provides a complete description of experiences subsequent to an exposure, such as incidence rates and the natural history of diseases and their staging.
* Provides information on more than one disease related to the same risk factor.
* Helpful for determining whether there is a cause-effect relationship, since the risk factor precedes in time the occurrence of disease.
* Both relative and attributable risks can be calculated.


Disadvantages:

* Cohort studies are long term and therefore are very expensive, time consuming and not always feasible.
* Large numbers need to be followed up, and cohort studies are therefore unsuitable for studying rare diseases. It is usually difficult to find and manage samples of this size.
* The most serious problem encountered is loss to follow-up of participants or interviewers. This can affect the validity of the conclusions and may make the samples less representative of their source population.
* Over the period of observation, there may be changes in the exposure status of the subjects. Many changes may occur, which may influence the relationship between exposure and disease and may confuse the issue of association.
* The study itself may influence the behaviour of persons under investigation in such a way that it may influence the development of the disease.
* Serious ethical issues may arise with apparent disease excess before data completion.

6. Case-control Studies


In this study design, we divide the subjects into two groups based on the presence or absence of a disease following an exposure to a potential risk factor, i.e.

* Group with the disease outcome
* Group without the disease outcome

As illustrated in Figure 1, subjects are selected on the basis of whether they do (cases) or do not (controls) have the disease of interest. History of exposure is then obtained from both cases and controls. Finally, the association between the exposure and the disease is studied by comparing the exposure pattern of the cases with the exposure pattern seen in the controls.


Figure 1: Design of a case-control study 
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As you can see, unlike a cohort study that begins with exposed and non-exposed groups 

and proceeds to assess their outcomes, a case-control study begins with two groups of 

cases and controls and proceeds to assess their exposures. Therefore, essentially in case- 

control studies, all relevant events (both exposure to potential risk factor and disease 
outcome) have already occurred when the study is initiated. 

This study design is also called a ‘retrospective study’ since the investigator is looking backward from disease outcomes to possible exposures. But this term should not be used to refer to a case-control study. As we learnt in chapter 2, the terms prospective and retrospective should be reserved only to refer to the timing of data collection in a study.

Given below is a classic example of a case-control study.


Case-Control Study
Since intramuscular (IM) vitamin K is given routinely to newborns in the United States, a pair of studies reporting a doubling in the risk of childhood cancer among those who had received IM vitamin K caused quite a stir. To investigate this association further, German investigators carried out a case-control study. The basic steps in performing this study were to:

1. Select the sample of cases: 107 children with leukaemia were selected from the German Childhood Cancer Registry.

2. Select the sample of controls: 107 children, matched by sex and date of birth, were randomly selected from children living in the same town as the case at the time of diagnosis (from local government residential registration records).

3. Measure the predictor variable: medical records were reviewed to determine which cases and controls had received intramuscular vitamin K in the newborn period.


The authors found that 69 of 107 cases (64%) and 63 of 107 controls (59%) had been exposed to IM vitamin K, for an odds ratio of 1.2 (95% confidence interval [CI], 0.7 to 2.3). (See Appendix 8.A for the calculation.) Thus this study did not confirm the existence of an association between the receipt of IM vitamin K as a newborn and subsequent childhood leukemia, although the point estimate and upper limit of the 95% CI leave open the possibility of a clinically important increase in leukemia.

Ref: Von Kries R, Gobel U, Hachmeister A, Kaletsch U, Michaelis J. Vitamin K and childhood cancer: a population based case control study in Lower Saxony, Germany.
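The odds ratio in the example above can be approximated from the quoted counts. The sketch below computes a crude (unmatched) odds ratio with an approximate log-based 95% confidence interval; the function name is hypothetical, and the choice of this particular interval method is an assumption, since the example does not say how its interval was obtained. The result is close to, but not identical with, the published figures, which may reflect rounding or a matched analysis.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with an approximate (log-based) 95% CI."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    oratio = odds_cases / odds_controls
    # Standard error of ln(OR): square root of the summed reciprocals.
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                       + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(oratio) - z * se_log)
    upper = math.exp(math.log(oratio) + z * se_log)
    return oratio, lower, upper


# 69 of 107 cases and 63 of 107 controls were exposed to IM vitamin K.
oratio, lower, upper = odds_ratio_with_ci(69, 107 - 69, 63, 107 - 63)
print(f"OR = {oratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval includes 1, the crude calculation leads to the same reading as the example: the data do not confirm an association, but they do not rule out a moderate increase either.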


6.1 Essential steps in carrying out a case-control study
There are 4 essential steps in carrying out a case-control study. They are:
a. Select cases and controls
b. Match cases and controls
c. Measure the exposure status
d. Analysis and interpretation


a. Select cases and controls 
Comparability of cases and controls is essential in a case-control study, as it enables the researcher to conclude that the disease among the cases was most likely due to the exposure under study. Cases and controls should be comparable in their baseline risk of developing the disease, other than from the exposure under study, and also in the accuracy and completeness of exposure data.

* Cases
Cases should represent a disease entity that is as homogeneous as possible. For example, hepatitis A and B have very different aetiologies and it would be wrong to consider all hepatitis infections as cases in a study. Therefore, it is important to identify cases using a case definition.


Case definition
Often in practice, individuals who fit into the definition of a case, in a particular setting, identified within a specific period of time are included as cases. Relevant information about the diagnosis can be collected using questionnaires, direct questioning and existing records such as bed head tickets, diagnosis cards and investigation reports.









A case definition should be clear and unambiguous 


with criteria for inclusion and exclusion from the study 





Develop a case definition.

Sources of cases
Cases can be selected from a number of sources. Such sources may include: 


* cases attending a clinic or admitted to/discharged from a hospital within a specified period of time
* cases identified in a community survey or surveillance program at a specified time

Whatever the source you use, ‘newly diagnosed’ cases are preferred to ‘prevalent’ cases for some of the reasons given below:
- Prevalent cases can have factors that would have enabled them to survive, and not necessarily the risk factors. Incident cases have risk factors at the time of diagnosis.
- Prevalent cases will give false information about behaviours, reporting those related to the advice given and not really the type of behaviour they had at the time of diagnosis.

Can you think of a disease for which obtaining prevalent cases is unavoidable?

* Controls
The most difficult task in the design of a case-control study is selecting an appropriate control group.
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Control definition 
Controls should be persons who would have been identified as ‘cases’ had they developed the disease. Therefore, any exclusions made in the identification of cases should also be applied equally to the controls.

Sources of controls
There is no control group that is optimal for all situations; the best choice may differ according to the scenario. Given below are some examples of different sources of controls.


* Hospital controls 
In evaluating the association of cigarette smoking with myocardial infarction, cases were identified from admissions to coronary care units of selected hospitals. Controls were selected from admissions to surgical, orthopaedic and medical wards of the same hospital who presented with musculo-skeletal diseases, trauma and a variety of other non-coronary conditions.





Select a control group appropriate for a study that assesses physical exertion as a risk factor for abortion.

Hint: Cases were women who had an abortion by 20 weeks of gestation and with a pathology specimen analyzed by one of the pathology departments in a health area over a given period of time.
Advantages of having hospital controls include:
- They are easily identified, readily available, and more willing to participate than healthy persons.
- They are more likely than healthy controls to recall exposure events because they are hospitalized and ill, and are therefore comparable to cases.
- When controls are identified from the same hospital as cases, they have had the same selection factors that influenced the cases to come to that particular hospital.
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One disadvantage of having hospital controls is that the disease for which the controls are 
hospitalized may be associated with the risk factor under study. For example, including patients with bronchitis and pneumonia as controls in a study assessing the relationship between cigarette smoking and lung cancer would underestimate this relationship. This is because smoking is a risk factor for all three diseases, so the smoking habits of such controls differ from those of the healthy population.


Healthy controls are selected when it is not desirable or feasible to select controls from hospitalized populations. However, they may pose difficulties such as:
- It is often expensive and time consuming to recruit them.
- They are usually busy, are difficult to meet, and are less motivated to participate.
- Those who volunteer to participate may be very different to the general population.
- They may not recall exposure with the same level of accuracy as hospital controls.


Given below are some examples of healthy controls. 


* General population controls
In evaluating the risk factors of acute lymphoblastic leukaemia (ALL), cases were children aged 0-9 years with ALL diagnosed from tertiary care centres between 1980 and 1993 in Québec, Canada. Controls were children from family allowance files who were matched for age, sex, and region of residence at the time of diagnosis of matched cases.


* Controls from the same source population of cases 

In evaluating the association of electro-magnetic radiation with leukaemia, cases were those who worked in an electric utility company whose underlying cause of death was leukaemia. Controls were selected from workers of the same company who were alive on the date of death of the index case. History of exposure to electro-magnetic radiation was determined for both groups using company job history information.


* Family, friends or neighbourhood controls
In evaluating the association of menstrual cycle pattern with endometriosis, cases were women aged 15-49 years who were newly diagnosed patients of endometriosis confirmed by laparoscopy and attending a specialist clinic during a specified period. Each woman was asked to provide names of four friends who were not biologically related; were not patients of the particular specialist clinic; were not known to be having endometriosis; and were within two years of their own age. From this pool of friends, controls were randomly selected.
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Family, friends or neighbourhood controls may be more co-operative than other controls because 
of their interest in the cases and may also offer a degree of similarity in relation to their lifestyles, ethnicity, socio-economic status and environment.

Think of an exposure for which family members or friends are as likely to be exposed as the cases, so that it leads to an under-estimation of the true effect.
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b. Match cases and controls 


Many risk factors and diseases are related to age, sex, etc. The results of a case-control study may not be meaningful if the two groups differ in these variables. For example, in a study of exercise and risk of myocardial infarction (MI), factors such as age, sex and smoking are also associated with MI. If we include more smokers as cases in this study, it could lead to an over-estimation of this association. A simple method that eliminates this problem is matching cases and controls.
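Individual matching can be sketched as a simple pairing procedure. The example below is a minimal illustration under stated assumptions: the records are invented, the matching variables (sex, smoking status and age within two years) are chosen only to mirror the MI example, and `match_controls` is a hypothetical helper rather than a standard routine.

```python
def match_controls(cases, candidates, age_tolerance=2):
    """Pair each case with one unused control of the same sex and
    smoking status whose age is within `age_tolerance` years."""
    unused = list(candidates)
    pairs = []
    for case in cases:
        for control in unused:
            same_sex = control["sex"] == case["sex"]
            same_smoking = control["smoker"] == case["smoker"]
            close_age = abs(control["age"] - case["age"]) <= age_tolerance
            if same_sex and same_smoking and close_age:
                pairs.append((case["id"], control["id"]))
                unused.remove(control)  # each control is used only once
                break
    return pairs


# Hypothetical records, invented purely for illustration.
cases = [
    {"id": "case-1", "age": 54, "sex": "M", "smoker": True},
    {"id": "case-2", "age": 61, "sex": "F", "smoker": False},
]
candidates = [
    {"id": "ctrl-1", "age": 60, "sex": "F", "smoker": False},
    {"id": "ctrl-2", "age": 40, "sex": "M", "smoker": True},
    {"id": "ctrl-3", "age": 53, "sex": "M", "smoker": False},
    {"id": "ctrl-4", "age": 55, "sex": "M", "smoker": True},
]
print(match_controls(cases, candidates))
```

A case for which no suitable control is found is simply left unpaired in this sketch; a real study would need an explicit rule for such cases.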



1. In a case-control study that evaluates the association of exercise with myocardial 
infarction, what other factors are more likely to distort the findings of this study? 
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2. Describe how you would overcome this by carrying out matching in this situation. 
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c. Measure the exposure status 
In a case-control study, exposure data is collected from both cases and controls via interviews, questionnaires and examination of records. Since this information is crucial in drawing conclusions about associations, we need to ensure that we do not introduce any bias during data collection (refer to bias on page 10).


Bias can be introduced into a case-control study either by the participants or by the researcher due to differences in the way:

- cases and controls are selected from different settings (selection bias)
- cases and controls recall exposure information (recall bias) or
- interviewers report or interpret information (interviewer bias)

Given below are some precautions that must be taken to minimize such bias:

* Exposure should be measured using objective methods, such as survey methods, questionnaires and procedures.
* Questionnaires and procedures that are used to gather exposure information should be uniformly applied to both cases and controls.
* Neither case nor control should be aware of the hypothesis under study, as it might bias the answers given by the participants.


d. Analysis and interpretation 
Once the data is collected, like in cohort studies, the relationship between the exposure variable and outcome can be presented in a two by two table, as shown below.

Table 1. Relationship between the exposure and outcome in a cohort study

                     Outcome status
Exposure status      Disease present   Disease absent   Total
Exposed              a                 b                a+b
Not exposed          c                 d                c+d
Total                a+c               b+d              N = a+b+c+d


In a case-control study investigating the association between cigarette smoking and lung cancer, it was found that of the 518 cases of lung cancer 499 were smokers. An equal number of controls were selected and of these, only 462 were smokers.

1.1 Write the 2x2 table.
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What you have done in 1.2 is testing a hypothesis for a case-control study using a 


statistical approach. Let us now try testing this hypothesis using an epidemiological approach.


6.2 Odds Ratio 


Recall how you tested a hypothesis in cohort studies by calculating a relative risk (RR) and attributable risk (AR) (pages 13-17).

1.3 Can you calculate RR and AR for the association in Exercise 1?

The answer is NO.
You may recall that one needs to know the incidence rate to calculate the RR or AR. In a case-control study, we cannot determine the rate of development of a disease because the information on the population at risk is not available.
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However, the RR can be estimated by calculating the ratio of the odds of exposure among 
the cases to the odds of exposure among the controls. Let us see how this is derived. 


If we consider a large population-at-risk, the number of people with a particular disease is likely to be very small compared to the number of people without that disease. Then, it follows that (refer to Table 1):

* (a + b) will closely approximate b and
* (c + d) will closely approximate d


Accordingly, we can re-write the equation for RR as:

$$\text{Relative Risk} \approx \frac{a/b}{c/d} = \frac{a \times d}{c \times b}$$

We call this the Odds Ratio (OR).

$$\text{Odds Ratio (OR)} = \frac{ad}{bc}$$

1.4 Calculate the OR for the above exercise.
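The approximation behind the odds ratio can be checked numerically. The sketch below uses two hypothetical populations, one where the disease is rare and one where it is common, with cell labels a, b, c and d as in Table 1; all counts and function names are invented for illustration.

```python
def relative_risk_from_table(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)] using the cells of Table 1."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_from_table(a, b, c, d):
    """OR = (a * d) / (b * c) using the cells of Table 1."""
    return (a * d) / (b * c)


# Hypothetical groups of 10,000 exposed and 10,000 non-exposed people.
rare = dict(a=30, b=9970, c=10, d=9990)        # disease is rare
common = dict(a=3000, b=7000, c=1000, d=9000)  # disease is common

for label, cells in (("rare", rare), ("common", common)):
    rr = relative_risk_from_table(**cells)
    oratio = odds_ratio_from_table(**cells)
    print(f"{label} disease: RR = {rr:.2f}, OR = {oratio:.2f}")
```

In the rare-disease population the two measures almost coincide, while in the common-disease population the OR moves noticeably away from the RR. This is why the rare-disease condition stated above matters when the OR is read as an estimate of the RR.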
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It should be clear to you now that in a case-control study, we can only calculate OR and 


not RR or AR. However, the OR is interpreted in the same way as the relative risk.

1.5 Comment on the result.

6.3 Advantages and disadvantages of case-control studies

Advantages:

* Ideal for identifying risk factors for rare diseases and also for diseases with a long incubation period.
* Relatively efficient, requiring a smaller sample than a cohort study.
* Relatively cheap.
* Can obtain results relatively quickly.
* Attrition (loss to follow up) is not a problem.
* Can investigate a wide range of possible risk factors.
* Consistency of measurement techniques can be easily maintained.
* Sometimes, this is the most feasible observational strategy for examining an association.

Disadvantages:
* Possible bias in selecting cases and controls (selection bias).
* Possible bias in measurement of exposure (recall and interviewer bias) and difficulties in obtaining the necessary information.
* There is no epidemiological denominator (population at risk) and therefore calculation of incidence rates is not possible.
* It is difficult to account for differences between those who died and those who survived, i.e. selective survival operates in case-control studies.
* It is not possible to find out about the pathology of other diseases related to the risk factor under investigation.
* It may be difficult to establish the order of events (temporal sequence), i.e. whether the exposure led to the disease or vice versa.

7. Cross-sectional design in analytical studies


7.1 Design of a cross-sectional study 
You may recall the cross-sectional surveys that we described under descriptive studies in volume 1 (pages 36-38). The cross-sectional design measures the prevalence of a disease (i.e. existing cases) and is useful in assessing the distribution of a disease and for formulating hypotheses on diseases and their exposures.

The cross-sectional design can also be used for examining associations in an analytical study that measures disease and exposure at the same point in time. Given below is the design for such a study.


Figure 1: Design of a cross-sectional study

Figure 1 illustrates the design of a cross-sectional study. As shown, the individuals surveyed will fall into four categories. Comparison between these groups helps determine if a given exposure is associated with the disease under study. The relationship between the exposure variable and disease can be presented in a 2x2 table, as shown in Table 1.

Table 1. Presentation of data observed in a cross-sectional study in a 2x2 table





7.2 Advantages and disadvantages of cross-sectional analytical studies
A major strength of cross-sectional studies over cohort studies is that there is no waiting for the outcome to occur. This makes them more feasible and less costly. In addition, it is the only design that gives the prevalence of a disease or risk factor.


However, there is one major weakness in this study design. Since both the exposure and disease status are measured at the same time, it is difficult to determine whether the exposure came before the disease or after the disease. This is clearly shown in the following example.

Figure 1. Hypothetical illustration of the interrelationship between an occupational exposure and prevalence of disease as measured by a cross-sectional study

1. Calculate the prevalence rates of respiratory symptoms in the two jobs at points X and Y.
2. Calculate the ratio of prevalence rates between job A and job B at points X and Y.
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Based on the ratios of prevalence rates that you calculated at point Y, we may conclude 
that job B is more hazardous than job A. However, this is not so; the difference arises because of the movement of affected workers from job A to job B. Therefore, being in job B would be the effect of the symptoms that they developed while in job A and not the cause of them. This is a difficulty that arises when one uses a cross-sectional design.

However, under special circumstances this study design can be used as effectively as a prospective cohort study to test hypotheses on disease associations, that is, when the exposure variable does not change over time. Such variables include factors present at birth such as skin colour, eye colour and blood group, and factors such as sex and highest level of education.

Cross-sectional designs are more appropriate for measuring relationships between permanent characteristics of individuals and chronic diseases or stable conditions.
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Cross sectional study designs are also impractical for the study of rare diseases, 
conditions of short duration and diseases with high case fatality, which are not detected 


by the one-time ‘snap shot’ of a cross-sectional study. 


In a cross-sectional survey of nutritional status among new school entrants, the heights of 


853 children (558 males) were measured. It was found that 51 of the males and 41 of the females were stunted according to the Waterlow classification.

1.1 Write the 2x2 table.
1.2 Test the hypothesis that female students are at greater risk of stunting than males.

7.3 Chi-Square Test


In Chapter 2, the data in Q1, Exercise 1 was presented in a table (this table had 4 cells and was referred to as a 2x2 contingency table). In this exercise, we looked at the association of two events, such as the exposure and outcome, in two independent samples. We also used a statistical approach to find the significance of this association between the two events. This approach was by way of comparing two proportions using the Z test or the SND test.
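As a reminder of how two proportions are compared, the sketch below computes the standard normal deviate using a pooled proportion under the null hypothesis. The counts are hypothetical and the function names are illustrative; the pooled-variance form is an assumption, since the formula used in Chapter 2 is not reproduced here.

```python
import math


def two_proportion_z(events_1, n_1, events_2, n_2):
    """Z statistic (standard normal deviate) for comparing two proportions."""
    p_1 = events_1 / n_1
    p_2 = events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)  # proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / se


def two_sided_p_value(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical samples: 40 of 200 exposed and 25 of 200 non-exposed diseased.
z = two_proportion_z(40, 200, 25, 200)
print(f"z = {z:.2f}, two-sided p = {two_sided_p_value(z):.3f}")
```

This approach handles only two samples at a time, which is the limitation that motivates the chi-square test.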


When a comparison has to be made between two events but for more than 2 independent samples, the Z test or the SND test cannot be applied. Instead, a different statistical test by the name of the chi-square test (χ²) has to be applied.

Let us consider three independent groups or samples of patients, all suffering from a particular disease and undergoing three different treatments T1, T2 and T3, to see how they respond to each treatment. If, of 100 patients given T1, 70 (70%) responded favourably; of 200 patients given T2, 60 (30%) responded favourably; and of another 200 patients given T3, 100 (50%) responded favourably, how do we find out the association between treatment and response to treatment?


\ 


In order to test this hypothesis on the association between treatment and response, we need to compare the proportions of patients responding favourably to these three treatments. In other words, a comparison has to be made between these three proportions.

In this instance, we will apply the chi-square test to assess the significance of this association. Let us come back to this problem at a later stage in this chapter.

Let us first apply the chi-square statistic for a two sample situation, as given below.


To find the association between gestational diabetes mellitus (GDM) and the parity of women, an obstetrical study was conducted among 790 expectant mothers of 30 years of age. Of these, 480 were in their first pregnancy, with 30 of them having GDM, while only 12 of the remaining pregnant women had GDM. Is there any association between gestational diabetes and the parity of women?


To apply the chi-square test to this above situation, the following steps have to be 


followed. 


i. Complete the table given below using the information in Exercise 3. These values are called ‘observed frequencies’.

Table 2. Observed values for the association between GDM and parity of women

                       Outcome
Parity                 GDM present   GDM absent   Total
First pregnancy
Other pregnancies
Total





ii. The chi-square test is based on calculating a set of expected values for each cell in the above table. The expected values are based on the assumption that the null hypothesis of ‘There is no association between the study groups in relation to the proportion of the factor of interest’ is true. If there was no association between the exposure (parity) and disease (GDM), the proportion of cases that were exposed would be the same as the proportion of the entire study population that is exposed.


\ 


If we want to calculate the expected number of primiparous women with GDM (a), it would be:

$$E(a) = \frac{a+b}{N} \times (a+c)$$

Refer to Table 1 in Chapter 2 (page 10) for these symbols.
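The same rule, (row total × column total) / N, gives the expected value for every cell. The sketch below applies it to the GDM counts given above; `expected_frequencies` is a hypothetical helper, and the cell counts not stated directly in the text (women without GDM, and the size of the second group) are derived by subtraction.

```python
def expected_frequencies(observed):
    """Expected cell counts under the null hypothesis of no association.

    `observed` is a list of rows; each expected value is
    (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: first pregnancy, other pregnancies. Columns: GDM, no GDM.
# 480 of the 790 women were in their first pregnancy (30 with GDM);
# the remaining 790 - 480 women included 12 with GDM.
observed = [[30, 480 - 30], [12, (790 - 480) - 12]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 1) for value in row])
```

The expected values keep the same row and column totals as the observed table, so they also sum to N. Because the helper works on any number of rows, it could equally be applied to the three-treatment table described earlier.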
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